Introduction


The  goal  of  this  innovator  award  is  to  continue  to  develop  and  apply  RNAi-based 
screening  methods  to  discover  new  routes  toward  breast  cancer  therapy.  The  project 
has  three  sets  of  goals.  First  is  to  integrate  genomic  and  genetic  information  on 
available  breast  cancer  cell  lines  to  identify  tumor-specific  vulnerabilities  and  to 
understand  genetic  determinants  of  therapy  resistance.  Second  is  to  probe  the  roles  of 
breast  cancer  stem  cells,  with  a  particular  emphasis  on  microRNAs.  The  third  is  to 
examine  regions  that  determine  familial  susceptibility  to  breast  cancer  by  applying 
novel,  focal  re-sequencing  methods  developed  in  the  laboratory. 

Body 

Fourth-generation  RNAi  libraries 

In  collaboration  with  Scott  Lowe  and  Steve  Elledge,  we  developed  a  multiplexed 
validation  assay  for  measuring  shRNA  potency,  termed  the  “sensor  assay.”  We  used 
this  to  generate  approximately  250,000  measurements  of  shRNA  efficacy,  the  largest 
such  dataset  ever  generated.  This  provided  sufficient  information  that  we  could  devise 
a  predictive  algorithm,  which  we  term  shERWOOD,  that  can  essentially  predict  the 
results  of  functional,  sensor  testing  of  shRNAs  in  silico.  We  are  presently  developing  a 
web  site  that  will  make  this  tool  available  to  the  community,  but  we  are  also  working 
toward  the  development  of  fourth-generation  shRNA  libraries  based  upon  the  approach. 

All  algorithms  that  predict  effective  RNAi  tools  tend  to  choose  sequences  that  begin  with
a  U.  This  is  thought  to  have  a  structural  basis  in  the  interaction  between  the  RNA  and 
Argonaute,  the  key  core  of  the  RNAi  effector  complex.  That  5’  residue  has  been  shown 
to  reside  in  a  binding  pocket,  which  favors  interaction  with  U.  Therefore,  the  sequence 
space  available  for  effective  RNAi  tools  is  really  restricted  to  only  %  of  the 
transcriptome.  When  the  small  RNA  interacts  with  Argonaute,  its  5’  end  is  not  available 
for  pairing  to  the  target  RNA.  Therefore,  even  though  the  5’  U  contributes  to  RISC 
binding,  it  is  irrelevant  to  target  recognition.  We  therefore  tested  the  idea  that  we  could 
expand  available  sequence  space  by  simply  releasing  the  aforementioned  constraint,  in 
essence  predicting  on  every  position  in  the  transcriptome  and  changing  the  small  RNA
guide  that  would  pair  to  that  site  so  that  it  contained  a  5’  U.  This  produced  even  higher 
scores  in  the  algorithm  and  was  especially  important  for  small  genes  with  limited 
numbers  of  potential  target  sequences. 
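The 1U idea described above can be sketched in a few lines of Python. This is only an illustration of the sequence manipulation, not the shERWOOD algorithm itself: the 22-nt guide length, the function names, and the toy transcript are assumptions made for the example, and no potency scoring is attempted. Each candidate guide is the reverse complement of a target site, and its 5' base is replaced with U whether or not it pairs with the target.

```python
# Sketch of "1U" guide enumeration. GUIDE_LEN and all names are illustrative
# assumptions; this is not the shERWOOD scoring algorithm.
COMPLEMENT = str.maketrans("ACGU", "UGCA")
GUIDE_LEN = 22  # assumed guide length, not taken from the report


def reverse_complement(rna):
    """Return the reverse complement of an RNA string."""
    return rna.translate(COMPLEMENT)[::-1]


def one_u_guides(transcript, guide_len=GUIDE_LEN):
    """Return (position, natural_5p_u, guide) for every site in the transcript.

    natural_5p_u is True when the fully complementary guide already starts
    with U, i.e. the site would also qualify under the conventional rule.
    """
    rna = transcript.upper().replace("T", "U")
    guides = []
    for pos in range(len(rna) - guide_len + 1):
        site = rna[pos:pos + guide_len]
        guide = reverse_complement(site)
        natural_5p_u = guide[0] == "U"
        # Force a 5' U: this base sits in the Argonaute pocket and is not
        # expected to pair with the target, so the mismatch is tolerated.
        guides.append((pos, natural_5p_u, "U" + guide[1:]))
    return guides


if __name__ == "__main__":
    for pos, natural, guide in one_u_guides("ACGUACGUACGU", guide_len=4):
        print(pos, natural, guide)
```

In this toy transcript only 2 of the 9 sites give a guide that naturally begins with U, while the 1U rule makes all 9 usable, which is the expansion of sequence space the text refers to.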

We  generated  a  collection  of  1,320,000  oligonucleotides  corresponding  to  shERWOOD 
shRNA  predictions  encompassing  the  human,  mouse,  rat,  and  fly  genomes.  These 
used  both  the  conventional  genomes  and  the  1U  strategy.  We  confined  our  fourth 
generation  libraries  to  REFSEQ  genes  and  employed  a  series  of  heuristics  to  maximize 
the  likelihood  that  our  target  sites  would  fall  within  constitutive  exons.  We  have  cloned 
these  into  our  basic  shRNA  expression  vector  and  begun  to  sequence  verify  and  array 
the  resulting  libraries.  Thus  far,  we  have  a  unique  collection  of  ~60,000  shRNAs  for  the
human  genome,  and  work  on  mouse  and  rat  is  ongoing.  We  have  also  signed  a  term 
sheet  with  a  distribution  company  to  make  the  materials  available  and  have  begun  to
validate  fourth  generation  shRNAs  in  a  variety  of  rigorous  assays. 

Screening  cell  lines  for  new  therapeutic  targets 


Over  the  past  year,  we  have  continued  to  work  toward  large  scale  screens  of  breast 
cancer  cell  lines.  We  have  continued  to  have  difficulty  with  drug  sensitivity 
measurements  provided  by  collaborating  groups.  However,  we  have  managed  to  make 
substantial  progress  on  in  vitro  screening.  A  table  with  the  current  status  of  this  effort  is 
presented  below. 


Breast Cancer Cell Lines  Screening Conditions        # of timepoints  Status

Her2+ treatment category
JIMT1                     straight lethal             T=11             Screen completed/sequenced
JIMT1                     lapatinib IC20              T=11             Screen completed/sequenced
MDA-MB-453                straight lethal             T=5              Screen completed/sequenced
MDA-MB-453                lapatinib IC20              T=5              Screen completed/awaiting sequencing
MDA-MB-361                lapatinib IC80              T=4              Screen completed/awaiting sequencing
EFM192A-TR                straight lethal             T=4              Screen completed/awaiting sequencing
EFM192A-TR                trastuzumab (15 ug/ml)      T=4              Screen completed/awaiting sequencing
HCC1954                   straight lethal             T=4              Screen completed/microarray analysis completed

ER+ treatment category
MCF-7 Parental + E2       straight lethal             T=4              Screen completed/awaiting sequencing
MCF-7 Parental + E2       without E2                  T=4              Screen completed/awaiting sequencing
MCF-7 Parental + E2       without E2/plus tamoxifen   T=3              Screen completed/awaiting sequencing
MCF-7-EDR                 with E2                     T=4              Screen completed/awaiting sequencing
MCF-7-EDR                 without E2                  T=4              Screen completed/awaiting sequencing
MCF-7-TAMR                with E2                     T=4              Screen completed/awaiting sequencing
MCF-7-TAMR                without E2/plus tamoxifen   T=4              Screen completed/awaiting sequencing
ZR-75-1 Parental + E2     straight lethal             T=4              Screen completed/awaiting sequencing
ZR-75-1 Parental + E2     without E2                  T=4              Screen completed/awaiting sequencing
ZR-75-1 Parental + E2     without E2/plus tamoxifen   T=4              Screen completed/awaiting sequencing
ZR-75-1-EDR               with E2                     T=4              Screen completed/awaiting sequencing
ZR-75-1-EDR               without E2                  T=4              Screen completed/awaiting sequencing
ZR-75-1-TAMR              with E2                     T=4              Screen completed/awaiting sequencing
ZR-75-1-TAMR              without E2/plus tamoxifen   T=4              Screen completed/awaiting sequencing
T47D                      straight lethal             T=4              Screen completed/microarray analysis completed

TN/Basal treatment category
Hs578T                    straight lethal             T=7              Screen completed/awaiting sequencing
MDAMB231                  straight lethal             T=4              Screen completed/awaiting sequencing
MDAMB468                  straight lethal             T=4              Screen completed/awaiting sequencing
MDAMB436                  straight lethal             T=4              Screen completed/microarray analysis completed
HCC1143                   straight lethal             T=4              Screen completed/microarray analysis completed
HCC1937                   straight lethal             T=4              Screen completed/microarray analysis completed
SUM149                    straight lethal             T=4              Screen completed/microarray analysis completed
SUM1315                   straight lethal             T=4              Screen completed/microarray analysis completed

Normal cells
HMEC                      straight lethal             T=4              Screen completed/microarray analysis completed

We  have  completed  screens  in  JIMT-1  and  MDA-MB436  to  the  point  that  they  have
been  fully  analyzed.  This  involved  the  development  of  custom  analysis  pipelines  that 
use  discrete  sequencing  counts  rather  than  our  more  traditional  microarray-derived 
datapoints.  We  immediately  noted  that,  unlike  screens  of  prior  libraries,  many  of  the 
genes  that  were  scored  as  positive  were  being  hit  with  multiple  shRNAs.  For  example, 
in  MDA-MB436,  3000  scoring  shRNAs  collapsed  to  only  1800  genes.  These  screens, 
which  were  carried  out  with  the  third-generation  library,  are  therefore  giving  much  more 
robust  data  than  we  have  ever  seen  before. 
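The report does not describe the analysis pipeline in detail, so the following is only a minimal sketch of how count-based scoring and the shRNA-to-gene collapse could look. It assumes read counts already normalized for sequencing depth, a simple log2 fold-change score with a pseudocount, and an arbitrary depletion threshold; the function names, the threshold, and the example counts are all hypothetical.

```python
# Sketch: score shRNAs from sequencing counts and collapse hits to genes.
# The scoring rule, threshold, and data layout are assumptions for illustration.
import math
from collections import defaultdict


def log2_fold_change(start_count, end_count, pseudocount=1.0):
    """Log2 ratio of final to initial read count, with a pseudocount."""
    return math.log2((end_count + pseudocount) / (start_count + pseudocount))


def collapse_hits(counts, threshold=-1.0):
    """Group depleted shRNAs by target gene.

    counts maps shRNA id -> (gene, start_count, end_count).
    An shRNA scores as a hit when its log2 fold change is <= threshold.
    """
    hits_by_gene = defaultdict(list)
    for shrna_id, (gene, start, end) in counts.items():
        if log2_fold_change(start, end) <= threshold:
            hits_by_gene[gene].append(shrna_id)
    return dict(hits_by_gene)


def genes_with_multiple_hits(hits_by_gene, min_shrnas=2):
    """Return genes supported by at least min_shrnas independent shRNAs."""
    return sorted(g for g, ids in hits_by_gene.items() if len(ids) >= min_shrnas)


if __name__ == "__main__":
    example = {
        "sh1": ("GENE_A", 999, 99),
        "sh2": ("GENE_A", 799, 199),
        "sh3": ("GENE_B", 499, 499),
        "sh4": ("GENE_C", 399, 99),
    }
    hits = collapse_hits(example)
    print(hits)
    print(genes_with_multiple_hits(hits))
```

A gene hit by several independent shRNAs is less likely to be an off-target artifact, which is why the ratio of scoring shRNAs to scoring genes is a useful summary of screen quality.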

We  are  currently  in  the  process  of  completing  analysis  of  the  remaining  screens  listed 
above  and  designing  methods  to  integrate  the  data  to  nominate  candidate  targets.

One  explicit  goal  of  the  innovator  expansion  proposal  was  to  identify  pathways  that  are 
important  in  mammary  stem  cells.  It  has  become  increasingly  clear  that  tumor  cells,  like 
normal  cells,  are  driven  by  self-renewing  compartments  known  as  tumor-initiating  or 
cancer  stem  cell  populations.  These  cells  exhibit  higher  resistance  to  targeted  therapy 
than  the  rest  of  the  tumor  cell  population.  Thus  it  is  important  to  understand  the 
essential  pathways  that  drive  tumor-initiating  cells  and  how  these  signatures  differ  from 
those  that  enable  normal  breast  progenitor  cells  to  survive.  To  mark  progenitor  cells  in 
normal  mouse  mammary  epithelial  cells,  we  have  applied  an  approach  developed  by  the
E.  Fuchs  laboratory  (Rockefeller  University)  to  identify/purify  label-retaining  cells  (LRCs)
that  mark  the  skin  stem  cell  niche.  The  system  is  built  upon  the  premise  that  stem  cells
are  slow  cycling  and  active  for  the  keratin  5  (K5)  promoter.  It  utilizes  a  tetracycline-responsive
promoter  driving  histone  2B-GFP  (H2B-GFP)  in  a  transgenic  mouse  model  expressing
the  tet  repressor-VP16  (tTA)  transgene  from  the  K5  promoter.  In  the  absence  of
doxycycline,  the  expression  of  GFP  is  high  in  epithelial  cells.  After  feeding  the  animal
doxycycline  for  a  period  of  several  weeks,  only  very  small  populations  of  epithelial  cells
retain  GFP  fluorescence. 
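The logic of label retention can be illustrated with a simple dilution model. This sketch assumes that H2B-GFP synthesis stops completely once doxycycline is given, that the labeled histone is stable, and that it is split evenly between daughter cells at each division. The chase length and cell-cycle times are hypothetical numbers chosen for the example, not values from these experiments.

```python
# Sketch of H2B-GFP label dilution during a doxycycline chase.
# Assumes no new synthesis, a stable label, and equal partitioning at division.


def divisions_during_chase(chase_days, cycle_days):
    """Number of complete divisions a cell undergoes during the chase."""
    return int(chase_days // cycle_days)


def retained_fraction(divisions):
    """Fraction of the starting label left per cell after the given divisions."""
    return 0.5 ** divisions


if __name__ == "__main__":
    chase_days = 42  # hypothetical six-week chase
    for name, cycle_days in [("fast-cycling", 3), ("slow-cycling", 21)]:
        n = divisions_during_chase(chase_days, cycle_days)
        print(name, n, retained_fraction(n))
```

Under these assumptions a cell that divides twice during the chase keeps a quarter of its label, whereas one that divides fourteen times keeps well under 0.01 percent, so only slow-cycling cells remain detectably fluorescent.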

Using  this  system  we  have  isolated  LRCs  from  mammary  epithelial  cells  and  have 
demonstrated  their  self-renewal  potential  both  in  vitro  (mammosphere  formation  assay) 
and  in  vivo  (reconstitution  of  mammary  gland).  Furthermore,  we  have  profiled  the  H2B- 
GFP+  cells  for  transcriptome,  methylome,  and  miRNA  expression  analysis.  Comparison 
of  the  transcriptomes  of  LRCs  and  other  subpopulations  of  the  mammary  gland  (luminal 
ductal  cells,  luminal  alveolar  cells,  luminal  progenitors,  myoepithelial  progenitors,  and 
myoepithelial  cells)  has  allowed  us  to  identify  a  new  cell-surface  marker  specific  for  the
H2B-GFP+  cells,  CD1.  CD1-specific  cell  populations  were  also  found  to  be  present  in
human  basal  breast  cancer  cell  lines.  We  are  currently  testing  whether  these  human
CD1-specific  tumor  cell  populations  have  self-renewing  capacity.

To  find  essential  genes/survival  pathways  for  H2B-GFP+  breast  cancer  progenitor  cells, 
we  have  made  a  focused  shRNA  library  targeting  the  MaSC  genes,  which  are  highly 
expressed  in  the  H2B-GFP+  compartment  but  not  in  other  normal  mammary  cell  types. 
This  MaSC  shRNA  library  was  then  used  to  perform  a  well-by-well  RNAi  screen  to  test 
the  effect  of  each  individual  shRNA  on  the  survival  of  COMMA-D  cells  (normal  mouse 
mammary  epithelial  cell  line)  and  4T1  cells  (mouse  mammary  basal-like,  metastatic  cell
line).  Several  candidate  genes  from  this  screen  are  being  tested  to  determine  whether 
they  are  essential  for  survival  of  human  triple-negative  breast  cancer  cells. 

To  explore  the  role  of  epigenetics  in  cancer  cell  survival,  we  have  completed  another 
well-by-well  RNAi  screen  in  4T1  cells  using  a  collection  of  1,100  shRNAs  targeting  243 
genes  involved  in  chromatin  regulation.  BRD4,  a  gene  that  was  recently  identified  as  a 
therapeutic  target  in  acute  myeloid  leukemia  (C.  Vakoc,  CSHL),  was  one  of  the  top  hits 
(three  independent  shRNAs  were  identified).  We  are  now  testing  whether  any  of  the 
breast  cancer  subtypes  exhibit  sensitivity  towards  BRD4  inhibition. 

Finally,  we  identified  an  additional  hit,  BBTF,  which  is  an  epigenetic  regulator  upon 
which  breast  cancer  lines  seem  selectively  dependent  (unlike  BRD4,  which  scores  as  a 
hit  in  a  number  of  different  cancer  types).  BBTF  is  a  bromodomain  protein  that  is
likely  to  be  amenable  to  the  design  of  chemical  inhibitors  via  strategies  similar  to  those 
used  for  BRD4.  We  are  presently  validating  BBTF  as  a  hit  across  our  human  cell  line 
panel  and  verifying  that  it  is  also  essential  when  these  cells  form  tumors  in  vivo.  Once 
these  studies  are  completed,  we  will  search  for  partners  in  the  design  of  drugs  against 
this  molecule. 

Epigenetic  characterization  of  the  mammary  epithelial  lineage 

Finally,  we  have  undertaken  a  full  epigenetic  characterization  of  the  mouse  mammary
epithelial  lineage  from  nulliparous  and  parous  mice.  In  part,  our  goal  has  been  to 
understand  how  DNA  methylation  patterns  change  as  cells  differentiate  along  this 
lineage  and  to  understand  what  discriminates  stem  cells  from  their  more  mature 
progeny.  We  hope  to  use  this  information  in  reference  to  similar  profiles  of  breast 
cancer  cells  to  learn  something  about  the  role  of  epigenetics  in  the  process  of  tumor 
formation.  A  second  key  goal  is  to  ask  whether  mammary  cell  types  show  epigenetic 
changes  upon  pregnancy.  If  so,  our  hope  is  that  these  will  somehow  explain  the  strong 
protective  effect  of  early  pregnancy  against  the  development  of  breast  cancers.  Since 
this  protection  is  essentially  life-long,  it  is  not  difficult  to  imagine  that  some  shift  in  the 
nature  of  mammary  epithelial  cell  populations  might  underlie  this  observation. 

Thus  far,  we  have  completed  an  analysis  of  hypomethylated  regions  in  mammary 
epithelial  cells  isolated  from  virgin  mice  by  conventional  marker  strategies  and  in  H2B-GFP
label-retaining  cells  (see  above).  The  only  remaining  profile  is  that  of  CD1-positive  cells,
for  which  data  have  been  collected  but  not  yet  analyzed.  We  see  highly  specific
methylation  signatures  that  are  individual  to  the  stem  cell  and  to  each  lineage  and  note 
both  methylation  gains  and  losses  as  cells  differentiate. 

We  have  also  made  substantial  progress  on  the  analysis  of  parous  animals.  Though  this 
analysis  is  at  its  earliest  stages,  all  indications  are  that  each  mammary  cell  type  will 
show  major  shifts  in  methylation  patterns.  A  key  goal  for  the  next  year  is  to  complete  a 
manuscript  reporting  results  of  analysis  of  the  virgin  animals  and  to  complete 
comparisons  of  virgin  and  parous  mammary  glands. 

Reportable  outcomes 

A  fourth-generation  human  shRNA  library  comprising  ~60,000  shRNAs  targeting
~19,000  genes

Conclusions 

We  continue  to  make  progress  toward  the  major  goals  of  this  application.  This  year, 
some  of  our  most  important  strides  have  been  in  understanding  mammary  epithelial 
biology  and  in  the  development  of  highly  optimized  shRNA  tools. 